
Distributed Log Store

· 3 min read
Saurav Jaiswal
Ex-CommBank Software Engineer, MSc Computer Science @University of Warwick


📌 Overview

This project is a simplified Kafka-like distributed append-only log system built from scratch using Java 23 and Spring Boot.

It demonstrates how high-throughput, ordered writes can be achieved by enforcing single-leader writes per partition, eliminating locks while maintaining correctness, durability, and scalability.

Core idea: Ordering is enforced by a single leader per partition. All writes are append-only and sequential, enabling lock-free, high-performance ingestion.

🎯 Goals of the Project

  • Understand append-only log architecture
  • Learn partitioned concurrency without locks
  • Implement leader–follower replication
  • Demonstrate crash recovery using log replay
  • Build a systems-level project, not CRUD

🧠 Key Concepts Implemented

| Concept | Description |
|---|---|
| Append-only log | Data is never updated or deleted in place |
| Partitioning | Data is sharded into independent tablets |
| Single leader | One writer per partition ensures ordering |
| Sequential I/O | Disk writes are sequential, not random |
| Offset-based reads | Consumers read using monotonically increasing offsets |
| Replication | Leader replicates log entries to followers |
| Crash recovery | Logs are replayed on startup |
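
To make the partitioning row concrete, here is a minimal sketch of how a key could be mapped to a tablet. The class name `HashTabletRouter` and the CRC32-modulo scheme are assumptions for illustration; the project's own `TabletRouter` may route differently.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical router: maps a record key to one of N tablets (partitions).
final class HashTabletRouter {
    private final int tabletCount;

    HashTabletRouter(int tabletCount) {
        if (tabletCount <= 0) {
            throw new IllegalArgumentException("tabletCount must be positive");
        }
        this.tabletCount = tabletCount;
    }

    // The same key always lands on the same tablet, so all records for a key
    // are ordered by that tablet's single leader.
    int tabletFor(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % tabletCount); // getValue() is never negative
    }
}
```

With four tablets, `new HashTabletRouter(4).tabletFor("user123")` returns a value between 0 and 3, and returns the same value on every call for that key.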

๐Ÿ—๏ธ Architectureโ€‹

High-Level Architecture Diagram

[Figure: log-server]

Component Interaction

                 ┌────────────┐
                 │   Client   │
                 └─────┬──────┘
                       │
                       v
          ┌─────────────────────────┐
          │      TabletServer       │
          │    (Leader Instance)    │
          └─────┬───────────┬───────┘
                │           │
    ┌───────────v───┐   ┌───v───────────┐
    │   Tablet A    │   │   Tablet B    │
    │  (Partition)  │   │  (Partition)  │
    └─────┬─────────┘   └─────┬─────────┘
          │                   │
   Append-only Log     Append-only Log
  (Sequential I/O)    (Sequential I/O)
          │ (gRPC)-high rate  │
  ┌───────v────────┐  ┌───────v────────┐
  │ Replica Server │  │ Replica Server │
  │   (Follower)   │  │   (Follower)   │
  └────────────────┘  └────────────────┘

Core Architectural Principles

  • Each tablet = one partition
  • Exactly one leader per tablet
  • Only leader writes to the log
  • Replication is log-based and ordered
  • Concurrency happens across tablets, not within

🔑 Why No Locks Are Needed

  • A tablet has exactly one leader
  • Only the leader thread appends to its log
  • Offset assignment is a monotonic counter
  • No shared mutable state between tablets

Result:

  • No race conditions on the append path
  • No file-level locks
  • Write throughput limited by sequential disk I/O rather than lock contention
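
The single-writer idea can be sketched as below, assuming one dedicated thread per tablet. The class name `SingleWriterTablet` and the in-memory list standing in for the log file are hypothetical; the project may structure its write path differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical single-writer tablet: all appends run on one dedicated thread,
// so the offset counter and the log itself need no synchronization.
final class SingleWriterTablet implements AutoCloseable {
    private final ExecutorService writer = Executors.newSingleThreadExecutor();
    private final List<String> log = new ArrayList<>(); // touched only by the writer thread
    private long nextOffset = 0;                        // touched only by the writer thread

    // Callers on any thread enqueue the append; the writer thread assigns the offset.
    CompletableFuture<Long> append(String value) {
        return CompletableFuture.supplyAsync(() -> {
            log.add(value);
            return nextOffset++;
        }, writer);
    }

    @Override
    public void close() {
        writer.shutdown();
    }
}
```

Note that the executor's hand-off queue still synchronizes internally. What this design removes is locking around the log and the offset counter, since only one thread ever touches them.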

📂 Log Record Format

Each log entry is stored sequentially in the following format:

[offset][timestamp][keyLength][key][valueLength][value]
  • offset → monotonically increasing
  • Records are immutable
  • New entries are appended at the end of the file

Example

Two consecutive records, shown pipe-delimited for readability with the length fields omitted:

0|1734150400123|user-123|CREATE-TASK
1|1734150400456|user-456|UPDATE-TASK

Records are append-only and written sequentially to disk. Offsets are strictly increasing and never reused.
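
As an illustration of the record layout, the sketch below encodes and decodes one entry with `ByteBuffer`. The field widths (8-byte offset and timestamp, 4-byte lengths), UTF-8 strings, and the name `LogRecord` are assumptions; the source does not state the on-disk widths.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical binary layout: 8-byte offset, 8-byte timestamp,
// 4-byte key length, key bytes, 4-byte value length, value bytes.
record LogRecord(long offset, long timestamp, String key, String value) {

    // Serializes this record into a buffer that is ready to be read or written out.
    ByteBuffer encode() {
        byte[] k = key.getBytes(StandardCharsets.UTF_8);
        byte[] v = value.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(8 + 8 + 4 + k.length + 4 + v.length);
        buf.putLong(offset).putLong(timestamp);
        buf.putInt(k.length).put(k);
        buf.putInt(v.length).put(v);
        buf.flip(); // switch the buffer from writing to reading
        return buf;
    }

    // Reads one record starting at the buffer's current position.
    static LogRecord decode(ByteBuffer buf) {
        long offset = buf.getLong();
        long timestamp = buf.getLong();
        byte[] k = new byte[buf.getInt()];
        buf.get(k);
        byte[] v = new byte[buf.getInt()];
        buf.get(v);
        return new LogRecord(offset, timestamp,
                new String(k, StandardCharsets.UTF_8),
                new String(v, StandardCharsets.UTF_8));
    }
}
```

Because each variable-length field is prefixed with its length, a reader can walk the file record by record without any delimiter scanning.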

๐ŸŒ REST APIs Swaggerโ€‹

Append Record

POST /append
{
  "key": "user123",
  "value": "event-data"
}

Response

{
  "tabletId": 2,
  "offset": 1042
}

Read Record

GET /read?tabletId=2&offset=1042
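
For calling these endpoints from Java, the following sketch builds the two requests with the JDK's `java.net.http` API. The base URL `http://localhost:8080` and the class name `LogStoreRequests` are assumptions; only the paths, query parameters, and JSON fields come from the examples above.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Builds (but does not send) requests for the append and read endpoints.
final class LogStoreRequests {
    private static final String BASE_URL = "http://localhost:8080"; // assumed host and port

    static HttpRequest append(String key, String value) {
        // Naive JSON assembly: fine for plain demo values, not for arbitrary input.
        String json = "{\"key\":\"" + key + "\",\"value\":\"" + value + "\"}";
        return HttpRequest.newBuilder(URI.create(BASE_URL + "/append"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    static HttpRequest read(int tabletId, long offset) {
        return HttpRequest.newBuilder(
                        URI.create(BASE_URL + "/read?tabletId=" + tabletId + "&offset=" + offset))
                .GET()
                .build();
    }
}
```

A request built this way can be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` once a server is running at the assumed address.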

๐Ÿ” Replication Modelโ€‹

  • Each tablet consists of:

    • 1 leader
    • N replicas
  • Workflow:

    1. Leader appends record
    2. Offset is assigned
    3. Record is replicated to followers
    4. Followers append in the same order

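Because followers append entries in the same order as the leader, every replica holds an identical log sequence. The extra copies provide durability and fault tolerance beyond a single node.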
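
The four workflow steps can be sketched in memory as follows. This is a simplified stand-in: real replication in the project goes over gRPC, and the synchronous fan-out, the class names, and the offset check on the follower are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory stand-in for the leader/follower workflow.
// Here a follower is just an object in the same JVM, not a remote server.
final class ReplicationSketch {

    static final class Follower {
        final List<String> log = new ArrayList<>();

        // A follower accepts an entry only at the next expected offset,
        // which keeps its log in the same order as the leader's.
        void replicate(long offset, String value) {
            if (offset != log.size()) {
                throw new IllegalStateException(
                        "expected offset " + log.size() + " but got " + offset);
            }
            log.add(value);
        }
    }

    static final class Leader {
        final List<String> log = new ArrayList<>();
        private final List<Follower> followers;

        Leader(List<Follower> followers) {
            this.followers = followers;
        }

        long append(String value) {
            long offset = log.size();       // steps 1-2: append locally, offset = position
            log.add(value);
            for (Follower f : followers) {  // steps 3-4: ship to followers in offset order
                f.replicate(offset, value);
            }
            return offset;
        }
    }
}
```

The offset check on the follower is what turns "same order" into something enforceable: a gap or a duplicate is rejected instead of silently reordering the log.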


💥 Crash Recovery

On startup:

  1. Log files are scanned
  2. Last offset is recovered
  3. Appends resume from the correct position

✔ No data loss
✔ No offset duplication
✔ Safe restarts
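
The startup scan could look like the sketch below, which walks a log file and returns the offset the next append should use. It assumes the bracketed record layout from the Log Record Format section with 8-byte offset and timestamp fields and 4-byte lengths, in big-endian order; those widths and the name `LogRecovery` are illustrative, not taken from the project.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical recovery scan over the assumed binary layout:
// long offset, long timestamp, int keyLength, key, int valueLength, value.
final class LogRecovery {

    // Returns the offset the next append should use (0 for an empty or missing log).
    static long recoverNextOffset(Path logFile) throws IOException {
        if (!Files.exists(logFile)) {
            return 0;
        }
        long next = 0;
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(logFile)))) {
            while (true) {
                long offset = in.readLong();
                in.readLong();                        // timestamp, not needed here
                in.readFully(new byte[in.readInt()]); // key
                in.readFully(new byte[in.readInt()]); // value
                next = offset + 1;                    // only fully read records count
            }
        } catch (EOFException endOfLog) {
            // Clean end of file, or a partially written last record: stop here.
        }
        return next;
    }
}
```

A record that was only partly written before a crash is ignored, so its offset is handed out again on the next append. This sketch does not validate lengths or checksums, so a corrupted length field would need extra handling in real code.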


๐Ÿ“ Project Structureโ€‹

mini-log-store/
├── src/main/java/com/example/logstore
│   ├── api/
│   │   ├── AppendController.java
│   │   └── ReadController.java
│   │
│   ├── server/
│   │   ├── TabletServer.java
│   │   ├── LeaderReplicator.java
│   │   └── ReplicaReceiver.java
│   │
│   ├── tablet/
│   │   ├── Tablet.java
│   │   ├── TabletRouter.java
│   │   └── TabletRegistry.java
│   │
│   ├── storage/
│   │   ├── AppendOnlyLog.java
│   │   ├── LogSegment.java
│   │   └── LogReader.java
│   │
│   ├── recovery/
│   │   └── LogReplayService.java
│   │
│   └── MiniLogStoreApplication.java
│
├── src/main/resources/
│   └── application.yml
│
├── README.md
└── pom.xml

๐Ÿ› ๏ธ Tech Stackโ€‹

  • Java: OpenJDK 23.0.1
  • Framework: Spring Boot
  • Build Tool: Apache Maven 3.9.11
  • I/O: FileChannel for sequential disk writes
  • Networking: gRPC for inter-node communication
  • Protocols: REST (HTTP) between client and leader; gRPC between leader and followers
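
To show what sequential writes through `FileChannel` can look like, here is a minimal append-only writer. The class name `SequentialLogWriter` and the explicit `sync()` method are assumptions; the source does not say when or whether the project forces data to disk.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical append-only writer on top of FileChannel.
final class SequentialLogWriter implements AutoCloseable {
    private final FileChannel channel;

    SequentialLogWriter(Path file) throws IOException {
        // APPEND positions every write at the current end of the file.
        this.channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND);
    }

    // Writes the payload at the end of the file and returns the byte position it starts at.
    long append(byte[] payload) throws IOException {
        long start = channel.size();
        ByteBuffer buf = ByteBuffer.wrap(payload);
        while (buf.hasRemaining()) {
            channel.write(buf); // a single write may be partial, so loop
        }
        return start;
    }

    // Forces written data to the storage device. How often to call this is a
    // durability versus throughput trade-off.
    void sync() throws IOException {
        channel.force(false);
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

Reading `channel.size()` before the write is only safe here because a single writer owns the file, which is the same assumption the rest of the design rests on.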

🚀 Performance Characteristics

  • Sequential disk writes
  • Append-only storage
  • No random disk access
  • Lock-free write path
  • Horizontal scalability via partitions

🧪 What This Project Intentionally Excludes

  • Raft / Paxos
  • Exactly-once semantics
  • Transactions
  • Schema registry

These are excluded to keep focus on core log storage mechanics.


🔮 Future Enhancements

  • Log compaction
  • Segment rolling
  • Read replicas
  • Metrics & monitoring
  • Leader re-election

📜 License

MIT